28 research outputs found

    High performance computation of landscape genomic models integrating local indices of spatial association

    Since its introduction, landscape genomics has developed quickly with the increasing availability of both molecular and topo-climatic data. The current challenges of the field mainly involve processing large numbers of models and disentangling selection from demography. Several methods address the latter, either by estimating a neutral model from population structure or by simultaneously inferring environmental and demographic effects. Here we present Samβada, an integrated approach to study signatures of local adaptation, providing rapid processing of whole genome data and enabling assessment of spatial association using molecular markers. Specifically, candidate loci for adaptation are identified by automatically assessing genome-environment associations. In addition, measuring the Local Indicators of Spatial Association (LISA) for these candidate loci makes it possible to detect whether similar genotypes tend to gather in space, which constitutes a useful indication of the possible kinship relationship between individuals. In this paper, we also analyze SNP data from Ugandan cattle to detect signatures of local adaptation with Samβada, BayEnv, LFMM and an outlier method (FDIST approach in Arlequin) and compare their results. Samβada is an open source software for Windows, Linux and MacOS X available at http://lasig.epfl.ch/sambada. Comment: 1 figure in text, 1 figure in supplementary material. The structure of the article was modified and some explanations were updated. The methods and results presented are the same as in the previous version

    Subsampling as an economic consequence of using whole genome sequence data in landscape genomics: how to maximize environmental information from a reduced number of locations?

    The recent availability of whole genome sequence (WGS) data calls for reconsidering sampling strategies in landscape genomics for economic reasons. Indeed, while we had many individuals and few genetic markers ten years ago, we now face the opposite situation, with the high cost of WGS limiting the number of sequenced samples. In other words, molecular resolution is becoming excellent, but it is achieved at the expense of spatial representativeness and statistical robustness. Therefore, when starting from a standard sampling, it is necessary to apply sub-sampling strategies in order to keep most of the environmental information. To study local adaptation of goat and sheep breeds in Morocco, we used a sampling design based on a regular grid overlaid on the territory. In each cell of this grid, 3 individuals were sampled in 3 different farms. Then, the final subset destined for sequencing had to meet two criteria in order to ensure regular coverage of both environmental and physical spaces. The first was met by using stratified sampling techniques over a range of climatic variables, previously filtered using a PCA. The second was met by minimising a clustering index in order to ensure spatial spread. The sub-sampling procedure using a hierarchical clustering resulted in two datasets of 162 goats selected out of 1283, and 162 sheep out of 1412, based on variables such as temperature, precipitation and solar radiation. By maximising the environmental information collected, we were able to select the individuals that are most relevant to study adaptation

    Geocomputational approaches for the analysis of Next-Generation Sequencing (NGS) and multi-scale data in landscape genomics

    The application of geocomputation to the field of landscape genomics (Manel et al. 2010) makes it possible to carry out the demanding computational tasks that recently emerged with the advent of large Next-Generation Sequencing datasets. When investigating the genetic mechanisms of evolution in spatially distributed plants or animals, geocomputation also proves useful for processing many association models (gene × environment) in a multi-scale context

    Riding the whole-genome data tsunami: a landscape genomic study of local adaptation in Moroccan sheep and goats

    In Morocco, as in other developing countries, small ruminants play an important role in the livelihood of a large proportion of farmers and landless shepherds. Conserving traditional breeds is essential in these countries since they are able to prosper in challenging habitats and their rich genomic resources allow them to adapt to new conditions. Therefore, the key genetic features of local adaptation must be identified, notably with landscape genomic approaches, in order to support and encourage sustainable breeding of low-input livestock. To this end, the NEXTGEN project led an extended sampling campaign of local small ruminants to study local adaptation in Morocco. Over 2000 sheep and goats were sampled in small farms and flocks spread over the whole country. For each species, 164 samples were selected in order to reliably represent the environmental conditions while having an even spatial distribution. A landscape genomic approach was applied to detect selection signatures among 28 million SNPs in sheep and 19 million SNPs in goats. In summary, the habitat of each sample is characterised with environmental variables, and significant genotype/environment associations point to the loci potentially under selection. Data were processed with Samβada, a dedicated landscape genomic software program. Preliminary results show that the method is able to process whole-genome sequence data. However, the relatively low number of samples compared with the number of SNPs implies the existence of false positives among the most significant results. Measuring the spatial dependence between samples, as featured in Samβada, may facilitate their detection and interpretation. Thus, combining whole-genome analysis with spatial statistics may lead to an integrated biogeoinformatic approach to study local adaptation

    SamBada in Uganda: landscape genomics study of traditional cattle breeds with a large SNP dataset

    Since its introduction, landscape genomics has developed quickly with the increasing availability of both molecular and topo-climatic data. Current challenges involve processing large numbers of models and disentangling selection from demography. Several methods address the latter, either by estimating a neutral model from population structure or by simultaneously inferring environmental and demographic effects. Here we present Samβada, an integrated software tool for landscape genomic analysis of large datasets. This tool was developed in the framework of NextGen with the objective of characterising traditional Ugandan cattle breeds using single nucleotide polymorphism (SNP) data

    High performance computation of landscape genomic models including local indicators of spatial association

    With the increasing availability of both molecular and topo-climatic data, the main challenges facing landscape genomics (i.e. the combination of landscape ecology with population genomics) include processing large numbers of models and distinguishing between selection and demographic processes (e.g. population structure). Several methods address the latter, either by estimating a null model of population history or by simultaneously inferring environmental and demographic effects. Here we present Samβada, an approach designed to study signatures of local adaptation, with special emphasis on high performance computing of large-scale genetic and environmental datasets. Samβada identifies candidate loci using genotype-environment associations while also incorporating multivariate analyses to assess the effect of many environmental predictor variables. This enables the inclusion of explanatory variables representing population structure into the models in order to lower the occurrences of spurious genotype-environment associations. In addition, Samβada calculates Local Indicators of Spatial Association (LISA) for candidate loci to provide information on whether similar genotypes tend to cluster in space, which constitutes a useful indication of the possible kinship between individuals. To test the usefulness of this approach, we carried out a simulation study and analysed a dataset from Ugandan cattle to detect signatures of local adaptation with Samβada, BayEnv, LFMM and an FST outlier method (FDIST approach in Arlequin) and compare their results. Samβada (an open source software for Windows, Linux and Mac OS X available at http://lasig.epfl.ch/sambada) outperforms other approaches and better suits whole genome sequence data processing
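
The LISA step mentioned in the abstract above can be illustrated with the local Moran's I statistic. The sketch below is not the software's own implementation: the function name `local_morans_i`, the k-nearest-neighbour weighting and the row-standardised weights are assumptions chosen for illustration, and significance testing (for example by permutation) is left out.

```python
import numpy as np

def local_morans_i(values, coords, k=2):
    """Local Moran's I for each sample (hypothetical helper, illustration only).

    values : 1-D sequence, e.g. presence (1) or absence (0) of a genotype.
    coords : (n, 2) sequence of sample coordinates.
    k      : number of nearest neighbours given non-zero weight (assumption).
    """
    x = np.asarray(values, dtype=float)
    xy = np.asarray(coords, dtype=float)
    n = x.size

    # Pairwise Euclidean distances; a sample is never its own neighbour.
    d = np.sqrt(((xy[:, None, :] - xy[None, :, :]) ** 2).sum(axis=2))
    np.fill_diagonal(d, np.inf)

    # Row-standardised k-nearest-neighbour weights: each row sums to 1.
    w = np.zeros((n, n))
    nearest = np.argsort(d, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    w[rows, nearest.ravel()] = 1.0 / k

    # I_i = z_i * sum_j(w_ij * z_j) / m2, with m2 the mean squared deviation.
    # m2 is zero for a monomorphic marker, which must be excluded beforehand.
    z = x - x.mean()
    m2 = (z ** 2).sum() / n
    return z * (w @ z) / m2

# Two tight spatial clusters carrying the same genotype within each cluster.
coords = [(0, 0), (1, 0), (10, 0), (11, 0)]
clustered = local_morans_i([1, 1, 0, 0], coords, k=1)
```

Positive values indicate that a sample is surrounded by similar genotypes, negative values that its neighbours differ from it.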

    The CIP2A–TOPBP1 axis safeguards chromosome stability and is a synthetic lethal target for BRCA-mutated cancer

    BRCA1/2-mutated cancer cells adapt to the genome instability caused by their deficiency in homologous recombination (HR). Identification of these adaptive mechanisms may provide therapeutic strategies to target tumors caused by the loss of these genes. In the present study, we report genome-scale CRISPR-Cas9 synthetic lethality screens in isogenic pairs of BRCA1- and BRCA2-deficient cells and identify CIP2A as an essential gene in BRCA1- and BRCA2-mutated cells. CIP2A is cytoplasmic in interphase but, in mitosis, accumulates at DNA lesions as part of a complex with TOPBP1, a multifunctional genome stability factor. Unlike PARP inhibition, CIP2A deficiency does not cause accumulation of replication-associated DNA lesions that require HR for their repair. In BRCA-deficient cells, the CIP2A-TOPBP1 complex prevents lethal mis-segregation of acentric chromosomes that arises from impaired DNA synthesis. Finally, physical disruption of the CIP2A-TOPBP1 complex is highly deleterious in BRCA-deficient tumors, indicating that CIP2A represents an attractive synthetic lethal therapeutic target for BRCA1- and BRCA2-mutated cancers

    Analysis of B. taurus and B. indicus admixture in Uganda as revealed by the Illumina BovineSNP50 Genotyping BeadChip

    The NextGen project investigates disease resistance in indigenous Ugandan cattle. Since population structure and stratification may produce biased results, we have investigated the genomic structure of sampled animals genotyped with the BovineSNP50 Genotyping BeadChip. A total of 788 animals from 9 populations belonging to Ankole (a cross between B. indicus and B. taurus), Zebu and Ankole-Zebu crosses have been sampled in 52 grid cells throughout the country (Table 1). We merged these data with a further 400 Italian Holstein cattle, genotyped in the framework of the SELMOL project, to look for possible introgression of European B. taurus. The data were filtered with the following exclusion criteria: MAF < 0.01, genotype call rate (SNPs) < 0.95, genotype call rate (animals) < 0.95. The resulting working dataset was composed of 43494 SNPs and 1188 animals. Hidden genetic structures were investigated by a Bayesian clustering approach with the ADMIXTURE software (Novembre et al. 2010). ADMIXTURE identified four ancestral genomic components. Three of them likely correspond to European taurine, African indicine and African taurine components (Figure 2). The fourth has a still unidentified origin (yellow, Figure 2d). Most Ugandan individuals investigated have a remarkable level of admixture. Overall, about 20% of the Zebu genome is of African taurine origin, confirming previous data on the foundation of African Zebu. The European taurine component (blue, Figure 2) is a minor component of African genomes, rare in Zebu and evenly distributed in Ankole, other taurine subgroups and Ankole-Zebu crosses. Indicine and taurine components show a clear geographical structure, the former being predominant in north-eastern Uganda, and the latter in the south-west. Holstein Friesian introgression is spread mostly in south-western Uganda, while the fourth component is located in a restricted geographical area in the East (Figure 3).
    The Ugandan cattle population is a complex admixture of African taurine (green in Figure 2) and zebuine (red) genomes, with a minor component of European origin (blue) and a rare but relevant contribution (yellow) from a still unidentified source. This complexity is to be accounted for in the following GWAS and selection signature analyses planned within the NextGen project
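
The exclusion criteria quoted in the abstract above (MAF < 0.01, SNP call rate < 0.95, animal call rate < 0.95) can be sketched as a small filter. This is a minimal illustration and not the authors' pipeline: the function name `filter_genotypes`, the 0/1/2 genotype coding with NaN for missing calls, and the order of the filters (SNPs first, then animals) are all assumptions.

```python
import numpy as np

def filter_genotypes(geno, maf_min=0.01, snp_call_min=0.95, animal_call_min=0.95):
    """Apply MAF and call-rate filters (hypothetical helper, illustration only).

    geno : (animals x SNPs) array coded 0/1/2 (copies of one allele),
           with NaN for missing calls (assumed coding).
    Returns the filtered matrix, the kept-animal mask and the kept-SNP mask.
    """
    geno = np.asarray(geno, dtype=float)
    called = ~np.isnan(geno)

    # Per-SNP call rate and allele frequency over called genotypes only.
    snp_call = called.mean(axis=0)
    n_called = called.sum(axis=0)
    with np.errstate(invalid="ignore", divide="ignore"):
        freq = np.nansum(geno, axis=0) / (2 * n_called)
        maf = np.minimum(freq, 1 - freq)
        # SNPs with no calls have NaN MAF and fail the comparison.
        keep_snps = (snp_call >= snp_call_min) & (maf >= maf_min)

        # Assumed order: animal call rate is computed on the retained SNPs.
        animal_call = called[:, keep_snps].mean(axis=1)
        keep_animals = animal_call >= animal_call_min

    return geno[np.ix_(keep_animals, keep_snps)], keep_animals, keep_snps

# Toy data: SNP 2 is monomorphic, SNP 3 has one missing call out of four.
toy = np.array([
    [0, 1, 2, 0],
    [1, 1, 2, np.nan],
    [2, 1, 2, 0],
    [0, 1, 2, 1],
])
filtered, kept_animals, kept_snps = filter_genotypes(toy)
```

With real data the thresholds interact: filtering animals before SNPs, or iterating both filters, can change the final counts, so the order should be reported alongside the thresholds.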